| Name | Version | Summary | Date |
| --- | --- | --- | --- |
| flash-attn | 2.8.3 | Flash Attention: Fast and Memory-Efficient Exact Attention | 2025-08-15 08:28:12 |
| pi-flash-attn | 2.8.2 | Flash Attention: Fast and Memory-Efficient Exact Attention | 2025-08-11 23:22:26 |
| causal-conv1d | 1.5.2 | Causal depthwise conv1d in CUDA, with a PyTorch interface | 2025-07-18 22:15:36 |
| quant-matmul | 1.2.0 | Quantized MatMul in CUDA with a PyTorch interface | 2024-03-20 03:44:36 |
| fast-hadamard-transform | 1.0.4.post1 | Fast Hadamard Transform in CUDA, with a PyTorch interface | 2024-02-13 05:49:17 |
| flash-attn-wheels-test | 2.0.8.post17 | Flash Attention: Fast and Memory-Efficient Exact Attention | 2023-08-13 21:27:09 |
| flash-attn-xwyzsn | 1.0.7 | Flash Attention: Fast and Memory-Efficient Exact Attention | 2023-06-01 03:53:40 |
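
These packages expose their CUDA kernels through thin PyTorch interfaces. As a minimal sketch of using the latest `flash-attn` release listed above (assuming a CUDA device and the fp16/bf16 `(batch, seqlen, nheads, headdim)` tensor layout that `flash_attn_func` expects; the example shapes are illustrative only):

```python
import torch
from flash_attn import flash_attn_func

# Illustrative sizes; flash-attn requires CUDA tensors in fp16 or bf16.
batch, seqlen, nheads, headdim = 2, 1024, 8, 64
q = torch.randn(batch, seqlen, nheads, headdim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# Exact attention computed without materializing the full seqlen x seqlen
# score matrix; causal=True applies a causal (autoregressive) mask.
out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # (batch, seqlen, nheads, headdim)
```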